Abstract

A theory of measurement uncertainty is presented, which, since it 
is based exclusively on the Bayesian approach and on the subjective 
concept of conditional probability, is applicable in the most general 
cases. 

The recent International Organization for Standardization (ISO) 
recommendation on measurement uncertainty is reobtained as the 
limit case in which linearization is meaningful and one is interested 
only in the best estimates of the quantities and in their variances. 



Introduction 

The value of a physical quantity obtained as a result of a measurement has a 
degree of uncertainty, due to unavoidable errors, of which one can recognize 
the source but never establish the exact magnitude. The uncertainty due to 
so-called statistical errors is usually treated using the frequentistic concept of
confidence intervals, although the procedure is rather unnatural and there are 
known cases (of great relevance in frontier research) in which this approach 
is not applicable. On the other hand, there is no way, within this frame, to 
handle uncertainties due to systematic errors in a consistent way. 
Bayesian statistics, however, allows a theory of measurement uncertainty
to be built which is applicable to all cases. The outcomes are in agreement
with the recommendation of the Bureau International des Poids et Mesures
(BIPM) and of the International Organization for Standardization (ISO),
which has also recognized the crucial role of subjective probability in assessing
and expressing measurement uncertainty.

In the next section I will make some remarks about the implicit use in 
science of the intuitive concept of probability as degree of belief. Then I 
will briefly discuss the part of the BIPM recommendation which deals with 
subjective probability. The Bayesian theory of uncertainty which provides 
the mathematical foundation of the recommendation will be commented 
upon. Finally I will introduce an alternative theory, based exclusively on 
the Bayesian approach and on conditional probability. More details, including
many practical examples, can be found in [?].

Claimed frequentism versus practiced subjectivism

Most physicists (I deal here mainly with Physics because of personal biases, 
but the remarks and the conclusions could easily be extended to other fields 
of research) have received a scientific education in which the concept of prob- 
ability is related to the ratio of favorable over possible events, and to relative 
frequencies for the outcomes of repeated experiments. Usually the first
"definition" (combinatorial) is used in theoretical calculations and the second one
(frequentistic) in empirical evaluations. The subjective definition of
probability, as "degree of belief", is, instead, viewed with suspicion and usually
misunderstood. The usual criticism is that "science must be objective" and,
hence, that "there should be no room for subjectivity". Some even say: "I do
not believe something. I assess it. This is not a matter for religion!".

It is beyond the purposes of this paper to discuss the issue of the so-called
"objectivity" of scientific results. I would just like to remind the reader that,
as well expressed by the science historian Galison [?],

"Experiments begin and end in a matrix of beliefs. ... beliefs in
instrument types, in programs of experiment enquiry, in the trained,
individual judgements about every local behavior of pieces of
apparatus ...".

In my experience, and after interviewing many colleagues from several
countries, physicists use (albeit unconsciously) the intuitive concept of
probability as "degree of belief", even for "professional purposes". Nevertheless,
they have difficulty in accepting such a definition rationally, because - in my
opinion - of their academic training. For example, apart from a small minority
of orthodox frequentists, almost everybody accepts statements of the kind
"there is 90 % probability that the value of the Top quark mass is between
...". In general, in fact, even the frequentistic concept of confidence interval
is usually interpreted in a subjective way, and the correct statement (according
to the frequentistic school) of "90 % probability that the observed value
lies in an interval around $\mu$" is usually turned around into a "90 % probability
that $\mu$ is around the observed value" ($\mu$ indicates hereafter the true value).
The reason is rather simple. A physicist - to continue with our example -
seeks to obtain some knowledge about $\mu$ and, consciously or not, wants to
understand which values of $\mu$ have high or low degrees of belief, or which
intervals $\Delta\mu$ have large or small probability. A statement concerning the
probability that a measured value falls within a certain interval around $\mu$ is
sterile if it cannot be turned into an expression which states the quality of the
knowledge of $\mu$ itself. Unfortunately, few scientists are aware that this can
be done in a logically consistent way only by using Bayes' theorem and
some a priori degrees of belief. In practice, since one often deals with simple
problems in which the likelihood is normal and the uniform distribution is
a reasonable prior (in the sense that the same degree of belief is assigned
to all the infinite values of $\mu$), Bayes' formula is formally "by-passed"
and the likelihood is taken as if it described the degrees of belief for $\mu$ after
the outcome of the experiment is known (i.e. the final probability density
function, if $\mu$ is a continuous quantity).
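
As a minimal numerical sketch of the "by-passing" described above, the Python fragment below applies Bayes' theorem on a grid for a normal likelihood and a uniform prior, and then evaluates the probability that the true value lies within 1.645 standard deviations of the observed value. The observed value, the resolution and the grid limits are illustrative assumptions, not numbers taken from any measurement.

```python
import numpy as np

def posterior_on_grid(x_obs, sigma, mu_grid, prior):
    """Bayes' theorem on a grid: final density proportional to likelihood x prior."""
    likelihood = np.exp(-0.5 * ((x_obs - mu_grid) / sigma) ** 2)
    unnormalized = likelihood * prior
    dmu = mu_grid[1] - mu_grid[0]
    return unnormalized / (unnormalized.sum() * dmu)

# Hypothetical observation and resolution (assumptions for illustration only)
x_obs, sigma = 10.0, 1.0
mu_grid = np.linspace(0.0, 20.0, 4001)
flat_prior = np.ones_like(mu_grid)   # same degree of belief for every mu

post = posterior_on_grid(x_obs, sigma, mu_grid, flat_prior)
dmu = mu_grid[1] - mu_grid[0]

# Probability that mu lies within 1.645 sigma of the observed value
inside = np.abs(mu_grid - x_obs) <= 1.645 * sigma
prob_90 = post[inside].sum() * dmu
post_mean = (mu_grid * post).sum() * dmu
print(prob_90, post_mean)
```

With a flat prior the normalized likelihood and the final distribution coincide, so the printed probability should come out close to 90 %. This is the "turned around" statement that physicists make intuitively.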

BIPM and ISO Recommendation on the measurement 
uncertainty 

An example which shows how natural this intuitive way of reasoning is
for the physicist can be found in the BIPM recommendation INC-1 (1980)
about the "expression of experimental uncertainty" [?]. It states that

The uncertainty in the result of a measurement generally consists of 
several components which may be grouped into two categories accord- 
ing to the way in which their numerical value is estimated: 

A: those which are evaluated by statistical methods; 
B: those which are evaluated by other means. 

Then it specifies that 

The components in category B should be characterized by quantities
$u_j^2$, which may be considered as approximations to the corresponding
variances, the existence of which is assumed. The quantities $u_j^2$ may be
treated like variances and the quantities $u_j$ like standard deviations.

Clearly, this recommendation is meaningful only in a Bayesian framework.
In fact, the recommendation has been criticized because it is not supported
by conventional statistics (see e.g. [?] and references therein). Nevertheless,
it has been approved and reaffirmed by the CIPM (Comité International des
Poids et Mesures) and adopted by ISO in its "Guide to the expression of
uncertainty in measurement" [?] and by NIST (National Institute of Standards
and Technology) in an analogous guide [?]. In particular, the ISO Guide
recognizes the crucial role of subjective probability in Type B uncertainties:

". . . Type B standard uncertainty is obtained from an assumed proba- 
bility density function based on the degree of belief that an event will 
occur [often called subjective probability ...]." 

"Recommendation INC-1 (1980) upon which this Guide rests implic- 
itly adopts such a viewpoint of probability ... as the appropriate way 
to calculate the combined standard uncertainty of a result of a mea- 
surement." 

The BIPM recommendation and the ISO Guide deal only with definitions and
with "variance propagation", performed, as usual, by linearization. A general
theory has been proposed by Weise and Wöger [?], which they maintain
should provide the mathematical foundation of the Guide. Their theory
is based on Bayesian statistics and on the principle of maximum entropy.
Although the authors show how powerful it is in many applications, the use
of the maximum entropy principle is, in my opinion, a weak point which
prevents the theory from being as general as claimed (see the remarks later
on in this paper, on the choice of the priors) and which makes the formalism
rather complicated. I show in the next section how it is possible to build an
alternative theory, based exclusively on probability "first principles", which
is very close to the physicist's intuition. In a certain sense the theory which
will be proposed here can be seen as nothing more than a formalization of
what most physicists unconsciously do.

A genuine Bayesian theory of measurement uncertainty 

In the Bayesian framework inference is performed by calculating the degrees
of belief of the true values of the physical quantities, taking into account
all the available information. Let us call $x = \{x_1, x_2, \ldots, x_{n_x}\}$ the n-tuple
("vector") of observables, $\mu = \{\mu_1, \mu_2, \ldots, \mu_{n_\mu}\}$ the n-tuple of the true values
of the physical quantities of interest, and $h = \{h_1, h_2, \ldots, h_{n_h}\}$ the n-tuple of
all the possible realizations of the influence variables $H_i$. The term "influence
variable" is used here with an extended meaning, to indicate not only external
factors which could influence the result (temperature, atmospheric pressure,
etc.) but also any possible calibration constants and any source of systematic
errors. In fact the distinction between $\mu$ and $h$ is artificial, since they are
all conditional hypotheses for $x$. We separate them simply because the aim
of the research is to obtain knowledge about $\mu$, while $h$ are considered a
nuisance.

The likelihood of the sample $x$ being produced from $h$ and $\mu$ is

$$f(x \mid \mu, h, H_\circ). \qquad (1)$$

$H_\circ$ is intended as a reminder that likelihoods and priors - and hence
conclusions - depend on all explicit and implicit assumptions within the problem,
and, in particular, on the parametric functions used to model priors and
likelihoods. (To simplify the formulae, $H_\circ$ will no longer be written explicitly.)
Notice that (1) has to be meant as a function $f(\cdot \mid \mu, h)$ for all possible values
of the sample $x$, with no restrictions beyond those given by the coherence [?].

Using Bayes' theorem we obtain, given an initial $f_\circ(\mu)$ which describes
the different degrees of belief on all possible values of $\mu$ before the information
on $x$ is available, a final distribution $f(\mu)$ for each possible set of values of
the influence variables $h$:

$$f(\mu \mid x, h) = \frac{f(x \mid \mu, h)\, f_\circ(\mu)}{\int f(x \mid \mu, h)\, f_\circ(\mu)\, d\mu}. \qquad (2)$$

Notice that the integral over a probability density function (instead of
a summation over discrete cases) is just used to simplify the notation. To
obtain the final distribution of $\mu$ one needs to re-weight (2) with the degrees
of belief on $h$:

$$f(\mu) = \frac{\int f(x \mid \mu, h)\, f_\circ(\mu)\, f(h)\, dh}{\int f(x \mid \mu, h)\, f_\circ(\mu)\, f(h)\, d\mu\, dh}. \qquad (3)$$

The same comment on the use of the integration, made after (2), applies
here. Although (3) is seldom used by physicists, the formula is conceptually
equivalent to what experimentalists do when they vary all the parameters of
the Monte Carlo simulation in order to estimate the "systematic error".¹

¹ Usually they are not interested in complete knowledge of $f(\mu)$ but only in best
estimates and variances, and normality is assumed. Typical expressions one can find in
publications, related to this procedure, are: "the following systematic checks have been
performed", and then "systematic errors have been added quadratically".
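
The re-weighting over the influence variables can be sketched numerically. The Python fragment below evaluates the final distribution of a single true value on a grid, weighting the likelihood and the prior with the degrees of belief on one influence variable and integrating it out. The specific model (a reading equal to a calibration scale factor times the true value, with a 5 % uncertainty on the scale) and all numbers are hypothetical choices made only for illustration.

```python
import numpy as np

def normal_pdf(t, mean, sd):
    return np.exp(-0.5 * ((t - mean) / sd) ** 2) / (np.sqrt(2.0 * np.pi) * sd)

def marginal_posterior(x, mu_grid, h_grid, likelihood, prior_mu, prior_h):
    """Grid sketch: weight f(x|mu,h) f0(mu) with f(h), integrate over h, normalize."""
    mu, h = np.meshgrid(mu_grid, h_grid, indexing="ij")
    joint = likelihood(x, mu, h) * prior_mu(mu) * prior_h(h)
    dmu = mu_grid[1] - mu_grid[0]
    dh = h_grid[1] - h_grid[0]
    numerator = joint.sum(axis=1) * dh            # integral over h
    return numerator / (numerator.sum() * dmu)    # normalization over mu

# Hypothetical model: the reading is h * mu, with unit statistical resolution
likelihood = lambda x, mu, h: normal_pdf(x, h * mu, 1.0)
prior_mu = lambda mu: np.ones_like(mu)            # uniform over the grid
prior_h = lambda h: normal_pdf(h, 1.0, 0.05)      # scale factor believed known to 5 %

mu_grid = np.linspace(50.0, 150.0, 2001)
h_grid = np.linspace(0.8, 1.2, 401)
post = marginal_posterior(100.0, mu_grid, h_grid, likelihood, prior_mu, prior_h)

dmu = mu_grid[1] - mu_grid[0]
mean = (mu_grid * post).sum() * dmu
sd = np.sqrt(((mu_grid - mean) ** 2 * post).sum() * dmu)
print(mean, sd)
```

In this sketch the width of the final distribution is dominated by the imperfect knowledge of the scale factor rather than by the unit statistical resolution, which is what varying a simulation parameter is meant to reveal.
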

Notice that an alternative way of getting $f(\mu)$ would be to first consider
an initial joint probability density function $f_\circ(\mu, h)$ and then to obtain $f(\mu)$
as the marginal of the final distribution $f(\mu, h)$. Formula (3) is reobtained
if $\mu$ and $h$ are independent and if $f_\circ(\mu, h)$ can be factorized into $f_\circ(\mu)$ and
$f(h)$. But this could be interpreted as an explicit requirement that $f(\mu, h)$
exists, or even that the existence of $f(\mu, h)$ is needed for the assessment
of $f(x \mid \mu, h)$. As stated previously, $f(x \mid \mu, h)$ simply describes the degree of
belief on $x$ for any conceivable configuration $\{\mu, h\}$, with no constraint other
than coherence. This corresponds to what experimentalists do when they
first give the result with "statistical uncertainty" only and then look for all
possible systematic effects and evaluate their related contributions to the
"global uncertainty".

Some comments about the choice of the priors 

I don't think that the problem of the prior choice is a fundamental issue. 
My view is that one should avoid pedantic discussions of the matter, because 
the idea of "universally true priors" reminds me terribly of the Byzantine
"angels' sex" debates. If I had to give recommendations, they would be: 

• the a priori probability should be chosen in the same spirit as the 
rational person who places a bet, seeking to minimize the risk of losing; 

• general principles may help, but, since it is difficult to apply elegant 
theoretical ideas to all practical situations, in many circumstances the 
guess of the "expert" can be relied on for guidance; 

• in particular, I think - and in this respect I completely disagree with the
authors of [?] - there is no reason why the maximum entropy principle
should be used in an uncertainty theory, just because it is successful in
statistical mechanics. In my opinion, while the use of this principle in
the case of discrete random variables is as well founded as Laplace's
indifference principle, in the continuous case there exists the unavoidable
problem of the choice of the right metric ("what is uniform in $x$ is not
uniform in $x^2$"). It seems to me that the success of maximum entropy
in statistical mechanics should be simply considered a lucky instance
in which a physical scale (the Planck constant) provides the "right"
metric in which the phase space cells are equiprobable.
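
The metric problem mentioned in the last item can be made concrete with a few lines of Python. The sketch below draws values uniformly in a variable and looks at how the same values are distributed once expressed through the square of that variable. The interval [0, 1], the sample size and the random seed are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(seed=1)
x = rng.uniform(0.0, 1.0, size=200_000)   # "uniform in x"
y = x ** 2                                 # the same values, expressed as x^2

# Fraction of samples falling in the lower half of each variable's range
frac_x_low = np.mean(x < 0.5)
frac_y_low = np.mean(y < 0.5)
print(frac_x_low, frac_y_low)
```

The first fraction is close to 0.5, while the second is close to 0.71 (the square root of 0.5): equal degrees of belief for equal intervals of the variable do not translate into equal degrees of belief for equal intervals of its square.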

In the following example I will use uniform and normal priors, which are 
reasonable for the problems considered. 
An example: uncertainty due to unknown systematic
error of the instrument scale offset



In our scheme any influence quantity of which we do not know the exact
value is a source of systematic error. It will change the final distribution of
$\mu$ and hence its uncertainty. Let us take the case of the "zero" of an
instrument, the value of which is never known exactly, due to limited accuracy and
precision of the calibration. This lack of perfect knowledge can be modeled
assuming that the zero "true value" $Z$ is normally distributed around 0 (i.e.
the calibration was properly done!) with a standard deviation $\sigma_Z$. As far
as $\mu$ is concerned, one may attribute the same degree of belief to all of its
possible values. We can then take a uniform distribution defined over a large
interval, chosen according to the characteristics of the measuring device and
to our expectation on $\mu$. An alternative choice of vague priors could be a
normal distribution with large variance and a reasonable average (the values
have to be suggested by the best available knowledge of the measurand and
of the experimental devices). For simplicity, a uniform distribution is chosen
in this example.

As far as $f(x \mid \mu, z)$ is concerned, we may assume that, for all possible
values of $\mu$ and $z$, the degree of belief for each value of the measured quantity
$x$ can be described by a normal distribution with an expected value $\mu + z$
and variance $\sigma_\circ^2$:

$$f(x \mid \mu, z) = \frac{1}{\sqrt{2\pi}\,\sigma_\circ} \exp\left[-\frac{(x - \mu - z)^2}{2\sigma_\circ^2}\right]. \qquad (4)$$



For each $z$ of the instrument offset we have a set of degrees of belief on $\mu$:

$$f(\mu \mid x, z) = \frac{1}{\sqrt{2\pi}\,\sigma_\circ} \exp\left[-\frac{(\mu - (x - z))^2}{2\sigma_\circ^2}\right]. \qquad (5)$$



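Weighting $f(\mu \mid x, z)$ with degrees of belief on $z$ using (3) we finally obtain

$$f(\mu) = f(\mu \mid x, \ldots, f_\circ(z)) = \frac{1}{\sqrt{2\pi}\,\sqrt{\sigma_\circ^2 + \sigma_Z^2}} \exp\left[-\frac{(\mu - x)^2}{2(\sigma_\circ^2 + \sigma_Z^2)}\right]. \qquad (6)$$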



The result is that $f(\mu)$ is still a gaussian, but with a variance larger than that
due only to statistical effects. The global standard deviation is the quadratic
combination of that due to the statistical fluctuation of the data sample and
that due to the imperfect knowledge of the systematic effect:

$$\sigma_{\rm tot}^2 = \sigma_\circ^2 + \sigma_Z^2. \qquad (7)$$
This formula is well known and widely used, although nobody seems to care
that it cannot be justified by conventional statistics.
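
The quadratic combination can be checked numerically without using the closed form. The Python sketch below builds, on a grid, the degrees of belief on the true value for each value of the offset (uniform prior on the true value), weights them with a normal belief on the offset centred on zero, and compares the standard deviation of the result with the quadratic sum. The names `sigma_0` and `sigma_z` stand for the statistical and the offset standard deviations; their values and the observed value are arbitrary assumptions.

```python
import numpy as np

def normal_pdf(t, mean, sd):
    return np.exp(-0.5 * ((t - mean) / sd) ** 2) / (np.sqrt(2.0 * np.pi) * sd)

# Hypothetical numbers, chosen only for illustration
x_obs, sigma_0, sigma_z = 5.0, 0.3, 0.4
mu_grid = np.linspace(x_obs - 5.0, x_obs + 5.0, 1001)
z_grid = np.linspace(-4.0, 4.0, 801)
dmu = mu_grid[1] - mu_grid[0]
dz = z_grid[1] - z_grid[0]

mu, z = np.meshgrid(mu_grid, z_grid, indexing="ij")
# Belief on mu for each offset z, weighted with the belief on z
weighted = normal_pdf(mu, x_obs - z, sigma_0) * normal_pdf(z, 0.0, sigma_z)
post = weighted.sum(axis=1) * dz      # integrate the offset out
post /= post.sum() * dmu              # normalize over mu

mean = (mu_grid * post).sum() * dmu
sd = np.sqrt(((mu_grid - mean) ** 2 * post).sum() * dmu)
sd_quadratic = np.hypot(sigma_0, sigma_z)
print(mean, sd, sd_quadratic)
```

With these numbers the quadratic sum is 0.5, and the grid result agrees with it to the accuracy of the numerical integration, while the expected value stays at the observed value.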

It is interesting to notice that in this framework it makes no sense to speak
of "statistical" and "systematic" uncertainties, as if they were of a different
nature. They are all treated probabilistically. But this requires the concept of 
probability to be related to lack of knowledge, and not simply to the outcome 
of repeated experiments. This is in agreement with the classification in Type 
A and Type B of the components of the uncertainty, recommended by the 
BIPM. 

If one has several sources of systematic errors, each related to an influence
quantity, and such that their variations around their nominal values produce
linear variations to the measured value, then the "usual" combination of
variances (and covariances) is obtained (see [?] for details).
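
For orientation, the "usual" combination can be written out as standard first-order propagation. The notation is an assumption introduced for this sketch: $h_i$ are the influence quantities, $\sigma_{h_i}$ the standard deviations describing the knowledge of each around its nominal value, $\sigma_\circ$ the statistical standard deviation of the measured value $x$, and the derivatives are evaluated at the nominal values.

```latex
\sigma_{\rm tot}^2 \;\approx\; \sigma_\circ^2
  + \sum_i \left(\frac{\partial x}{\partial h_i}\right)^{2} \sigma_{h_i}^2
  + 2 \sum_{i<j} \frac{\partial x}{\partial h_i}\,
      \frac{\partial x}{\partial h_j}\, \mathrm{Cov}(h_i, h_j)
```

For a single offset, for which the derivative is 1 and there are no covariance terms, this reduces to the quadratic sum of the statistical and the offset contributions obtained in the example above.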

If several measurements are affected by the same unknown systematic
error, their results are expected to be correlated. For example, considering only
two measured values $x_1$ and $x_2$ of the true values $\mu_1$ and $\mu_2$, the likelihood is

$$f(x_1, x_2 \mid \mu_1, \mu_2, z) = \frac{1}{2\pi\,\sigma_1 \sigma_2} \exp\left[-\frac{1}{2}\left(\frac{(x_1 - \mu_1 - z)^2}{\sigma_1^2} + \frac{(x_2 - \mu_2 - z)^2}{\sigma_2^2}\right)\right]. \qquad (8)$$

The final distribution $f(\mu_1, \mu_2)$ is a bivariate normal distribution with
expected values $x_1$ and $x_2$. The diagonal elements of the covariance matrix are
$\sigma_i^2 + \sigma_Z^2$, with $i = 1, 2$. The covariance between $\mu_1$ and $\mu_2$ is $\sigma_Z^2$ and their
correlation factor is then

$$\rho(\mu_1, \mu_2) = \frac{\sigma_Z^2}{\sqrt{\sigma_1^2 + \sigma_Z^2}\,\sqrt{\sigma_2^2 + \sigma_Z^2}}. \qquad (9)$$

The correlation coefficient is positive, as the definition of the
systematic error considered here implies. Furthermore, as expected, several
values influenced by the same unknown systematic error are correlated when
the uncertainty due to the systematic error is comparable to - or larger than
- the uncertainties due to sampling effects alone.
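
The correlation induced by a common offset can be illustrated by sampling. In the Python sketch below an offset is drawn from a normal belief centred on zero and, for each offset, the two true values are drawn from the normal distributions obtained with uniform priors. The sampled correlation is then compared with the ratio of the offset variance to the product of the two total standard deviations. The readings, the three standard deviations, the sample size and the seed are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(seed=7)
n = 400_000
x1, x2 = 10.0, 12.0                          # hypothetical readings
sigma_1, sigma_2, sigma_z = 0.5, 0.8, 1.0    # hypothetical standard deviations

z = rng.normal(0.0, sigma_z, size=n)   # common offset, believed to be near zero
mu1 = rng.normal(x1 - z, sigma_1)      # belief on mu_1 given x1 and the offset
mu2 = rng.normal(x2 - z, sigma_2)      # belief on mu_2 given x2 and the offset

rho_sampled = np.corrcoef(mu1, mu2)[0, 1]
rho_formula = sigma_z**2 / (np.hypot(sigma_1, sigma_z) * np.hypot(sigma_2, sigma_z))
print(rho_sampled, rho_formula)
```

Reducing `sigma_z` well below `sigma_1` and `sigma_2` makes both numbers approach zero, which is the statement that the correlation only matters when the systematic contribution is comparable to or larger than the statistical ones.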



Conclusions 

Bayesian statistics is closer to the physicist's mentality and needs than one 
may naively think. A Bayesian theory of measurement uncertainty has the 
simple and important role of formalizing what is often done, more or less
intuitively, by experimentalists in simple cases, and of giving guidance in more
complex situations.






As far as the choice of the priors and the interpretation of conditional 
probability are concerned, it seems to me that, although it may look
paradoxical at first sight, the "subjective" approach (à la de Finetti) has the best
chance of achieving consensus among the scientific community (after some 
initial resistance due to cultural prejudices). 